
COMPUTER BASED SYSTEM FOR MANIPULATING DIGITAL MEDIA
Technical Field 

This invention relates to a computer software system for manipulating digital media.
Background Art

Application software for editing digital video is an extremely sophisticated and powerful tool
because it is primarily designed for, and sold to, the video professional. Such an individual
requires access to many complex functions and is prepared to invest time and effort in
learning to become skilled in their use. Historically, the terminology and conventions of
Digital Editing have evolved from a traditional film editing environment where rushes are cut
and spliced together to tell a story or follow a script. As digital mixer technology advanced,
new techniques were combined with these conventional methods to form the early
pioneering software based digital editors.

To the video or film professional editing is second nature and the complexities of a
time-based media go unnoticed since, having already grasped concepts and learned processes,
they are able to concentrate on the nuances of different editing packages, of which there are
many.

Conventionally these packages, through the use of a Graphical User Interface (GUI), attempt
to provide an abstraction of the media in terms of many separate tracks of video and audio.
These are represented on the output device in symbolic fashion and provision is made for
interacting with these representations using an input device such as a mouse. Typically the
purpose is to create a new piece of media as an output file, composed by assembling clips or
segments of video and audio along a timeline that represents the temporal ordering of
frames. Special effects such as wipes and fades can be incorporated, transparent overlays can
be added, colour and contrast can be adjusted. The list of manipulations made possible by
such tools is very long indeed. A typical system is described in, for example, Foreman; Kevin
J., et al. "Graphical user interface for a video editing system", U.S. Patent 6,469,711.

It is possible, however, that an individual who is a consumer of media, rather than a
producer, may need to perform a simple editing operation on a media file in order to
accomplish their primary task, for example to give a multi-media presentation. In this case
such tools have their drawbacks. They may be too expensive to justify individually, or to
have enough of in order to be available when or where needed. The limited amount of use
and the small fraction of the capabilities used in such situations may make them uneconomic.
The steep learning curve associated with such tools may mean that an inappropriate amount
of effort is expended on something that is not the primary occupation or concern of the tool
user. For occasional or infrequent use there will be reluctance on the part of any user
repeatedly to switch environments or learn and relearn new tools to perform simple last
minute tasks.

This situation parallels previous well-known situations where improvements in the
availability, usability and price/performance ratio of consumer IT equipment have caused a
significant reappraisal of what is possible and a change in behaviour to exploit new
possibilities. For example, the production of high-quality printed documents was once the
province of highly skilled people using expensive and specialised equipment. Now anybody
with a need to produce such a document, who has access to a computer and a
word-processing program, can do so. A similar shift in paradigm is happening in Digital Video
Editing, where there is a need for highly accessible and usable tools that focus on the needs
of a new generation of user, and do not necessarily try to recreate the feel of a traditional video
editing environment.

Such tools must exhibit an intuitive, predictable and consistent behaviour as understood by a
new generation of digital media professionals who may well be extremely familiar with the
manipulation of documents of various kinds through a computer's GUI, but be completely
unfamiliar with the characteristics of time-based media. The tools need not supersede long
established and specialised tools used by trained professionals but, rather, provide a bridge in
order that new users may be as comfortable working with time based media as they are
working with documents.

Conventionally, video editors are structured as specialised 'monolithic' applications. Current
software technology, however, is well capable of adding sophisticated editing functions to
unrelated applications through the use of software 'plug-ins'. The Microsoft® DirectShow®
Editing Services is an application programming interface (API) that is built on top of
Microsoft® DirectShow® and that allows video editing capabilities to be added to applications.
In this example filters, implemented as Component Object Model (COM) components, are created
and inter-connected to form filter graphs. As another example the QuickTime track based
architecture is the foundation of many modern day editors such as Adobe Premiere. It offers
embedded API based access, resident below the application layer, that provides for simple
track manipulation. There is no reason why such plug-ins may not be deployed in
applications that are not, primarily, designed for video editing.

Accordingly, these are the attributes of a tool that is more appropriate to the needs of such a
consumer.

1. Simple and intuitive to use; in particular, little time and effort is required to learn
enough to accomplish the task in hand.

2. Terminology and workflow consistent with a shift in convention towards action led
digital media editing; e.g. 'Jog' to select as with modern VCRs and a simple crop
ability to trim the running length of a piece of media.

3. Available whenever and wherever needed, even if the user did not foresee the need
for such a tool until that need cropped up, i.e. the edit capability is provided as an
intrinsic part of the environment by way of any player of the media.

4. Provides a consistent interface to the user irrespective of the type of 'container'
application it is associated with. It would look exactly the same whether incorporated
into an electronic text document, a spreadsheet, or a slide presentation.

5. Persistence of modifications; e.g. the user opens a media object, or a document
containing an embedded media object, and expects any changes they make to persist
between sessions.

Disclosure of Invention 

The invention relates to a method for adding the capability for media-manipulation to
software media players, such that this capability is intrinsic to the media player, and which
comprises a set of tools that ensure that consistent behavioural, visual and functional aspects
are maintained between media player applications.

Briefly, the invention works as follows. 

According to one aspect of the invention a Graphical User Interface (GUI) for editing is
implemented.



In one preferred embodiment of this aspect of the invention a plug-in module is loaded into
the computer's memory to provide the specific functionality required. This software module
has interfaces to a media delivery subsystem, such as the Microsoft® DirectShow®
architecture for the Microsoft® Windows® platform, that provides services for streaming,
buffering, synchronisation, decoding and rendering of video and audio. Media is streamed
into a local cache that provides for fine-grain 'scrubbing' and 'looping' of short sections
around in and out points. A set of instructions is devised for each piece of media and its
interaction with a timeline. Specific elements are constructed in memory to process these
instructions and subsequently handle the media in a suitable form, as compatible with the
media play architecture in operation. New and modified elements may be constructed and
reconstructed as required; each element may process, but is not limited to, a single set of
instructions or piece of media.
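
By way of illustration only, the per-media instructions and the short 'looping' section around
an in or out point might be modelled as in the following sketch. The names ClipInstruction,
timeline_duration and loop_window, the use of seconds as the time unit and the one second
half-width of the loop are assumptions made for this example; the description does not
prescribe any particular data layout.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ClipInstruction:
    """One instruction: play a section of a piece of media (hypothetical type)."""
    media_ref: str    # path or URL of the source media, which is never modified
    in_point: float   # seconds into the source where playback starts
    out_point: float  # seconds into the source where playback stops

    def __post_init__(self) -> None:
        if not 0 <= self.in_point < self.out_point:
            raise ValueError("need 0 <= in_point < out_point")

    @property
    def duration(self) -> float:
        return self.out_point - self.in_point


def timeline_duration(instructions: List[ClipInstruction]) -> float:
    """Running length of a timeline that plays the instructions back to back."""
    return sum(i.duration for i in instructions)


def loop_window(point: float, media_length: float,
                half_width: float = 1.0) -> Tuple[float, float]:
    """Short section around an in or out point, clamped to the media bounds.

    This is the range a local cache would need to hold for 'looping'.
    """
    return max(0.0, point - half_width), min(media_length, point + half_width)
```

Because each instruction only refers to a section of the source, trimming a clip amounts to
replacing one instruction with another; the media itself is left untouched.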

The functionality provided by this software module consists of:

• Graphics rendering to allow combination and/or overlay of graphical data for the GUI
with pixels that are decoded from the video part of the media file and rendered into the
video area on the screen.

• A cache for portions of the media file in the memory of the client machine.

• A state machine, whose transitions guide a user through a sequence of interactions with a
graphical user interface (GUI).

• Graphics that implements visual feedback of the current state to the user.

• Graphics that implements a visual metaphor that provides the user with an intuitive
understanding of the operation of the interface.

• An exporter for the persistence of the chosen manipulations, for example to "Save" the
processed media to memory, creating a new media object, or to create a new set of
instructions that describes the precise operation required to effect the manipulations for
playback, including but not limited to such instructions as references to sections within a
remote piece of media.

• Graphics that allows labels of various types to be added to significant parts of the media
file in order to identify them as such.

• Graphics that allows the definition of Actions to be taken when significant parts of the
media file are encountered in normal playback.
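
The second form of persistence listed above, a set of instructions that refers to sections
within a possibly remote piece of media, could be sketched as follows. The JSON layout, the
field names "media", "in" and "out", and the function names are hypothetical; the description
does not define a file format for the instructions.

```python
import json
from typing import Dict, List


def export_instructions(sections: List[Dict]) -> str:
    """Serialise the chosen manipulations instead of re-encoding the media."""
    for section in sections:
        if section["out"] <= section["in"]:
            raise ValueError("out point must follow in point")
    return json.dumps({"version": 1, "sections": sections}, indent=2)


def import_instructions(text: str) -> List[Dict]:
    """Read the instructions back so that playback can be reconstructed."""
    return json.loads(text)["sections"]


# Example: keep two sections of a remote clip (the URL is a placeholder).
saved = export_instructions([
    {"media": "http://example.com/talk.mpg", "in": 5.0, "out": 20.0},
    {"media": "http://example.com/talk.mpg", "in": 95.0, "out": 110.0},
])
restored = import_instructions(saved)
```

Storing references rather than pixels keeps the saved object small and allows the changes to
persist between sessions without creating a second copy of the media.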

In this embodiment of this aspect of the invention the GUI is provided by modules within
the software framework that implements the media player, by the addition of visible user
interface components (buttons, text boxes, etc.) either overlaid or actually burnt into the
rendered video window (i.e. the pixels written to the framestore by the video renderer are
overwritten). In the Windows Media® architecture, where software filter graph components
are linked together to implement a media player, this functionality may be added into a video
renderer filter or an overlay filter.

In another embodiment of this aspect of the invention the GUI is provided by software
modules other than those embedded within the media player framework, such as ActiveX
controls.

According to another aspect of the invention elements are exchanged between instances of a 
media player. 

In the preferred embodiment of this aspect of the invention the Windows® Media
environment is employed such that one instance of the player may be used to manage the
"master" timeline, while another allows clips to be trimmed to the desired length and then
dragged and dropped into the "master" player instance. At this time the recipient instance
may choose to combine the filter graph for the new piece of media with those already in
existence, or it may choose to reconstruct a new filter graph based on the complexity and
required interaction of the current timeline objects.
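
The choice made by the recipient instance can be pictured with the following sketch. It does
not use the Windows® Media or DirectShow® interfaces; MasterPlayer, its drop method and the
rebuild_threshold used as a stand-in for "complexity" are all hypothetical, and a real player
would apply its own criteria.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (media reference, in point, out point), with times in seconds
Clip = Tuple[str, float, float]


@dataclass
class MasterPlayer:
    """Stand-in for the player instance that manages the "master" timeline."""
    rebuild_threshold: int = 4  # hypothetical limit on timeline complexity
    timeline: List[Clip] = field(default_factory=list)
    graph_rebuilds: int = 0     # how many times a new graph was constructed

    def drop(self, clip: Clip) -> str:
        """Accept a trimmed clip dropped from another player instance."""
        self.timeline.append(clip)
        if len(self.timeline) > self.rebuild_threshold:
            # Too many objects interact: construct one new graph for them all.
            self.graph_rebuilds += 1
            return "rebuild"
        # Simple case: attach the clip's graph to those already in existence.
        return "combine"


master = MasterPlayer(rebuild_threshold=2)
decisions = [master.drop(("clip%d.mpg" % n, 0.0, 10.0)) for n in range(3)]
```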

According to another aspect of the invention a process flow is provided that allows
untrained users to achieve their goal with minimum effort and distraction from their primary
task.

In the preferred embodiment of this aspect of the invention state machines help walk the
users through operations to avoid mistakes and distil the complexity of editing into bounded
and easy to understand processes. Visual and tactile feedback will provide rapid confidence
in the task and aid progress; e.g. to slim down a media object the user will select a "Start
Here" in point and be guided towards a "Stop Here" out point.
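
A minimal sketch of such a state machine for the trim operation is given below. The state
names, the event names ("start_here", "stop_here", "confirm", "cancel") and the TrimGuide class
are assumptions made for this example; the description specifies only that the transitions
guide the user from an in point to an out point.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    IN_SET = auto()
    OUT_SET = auto()
    CONFIRMED = auto()


# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    (State.IDLE, "start_here"): State.IN_SET,
    (State.IN_SET, "start_here"): State.IN_SET,    # the in point may be moved
    (State.IN_SET, "stop_here"): State.OUT_SET,
    (State.OUT_SET, "stop_here"): State.OUT_SET,   # the out point may be moved
    (State.OUT_SET, "confirm"): State.CONFIRMED,
    (State.IN_SET, "cancel"): State.IDLE,
    (State.OUT_SET, "cancel"): State.IDLE,
}


class TrimGuide:
    """Walks the user through choosing an in point, then an out point."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.in_point = None
        self.out_point = None

    def handle(self, event: str, position: float = 0.0) -> bool:
        """Apply an event; return False and change nothing if it is not allowed."""
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return False
        if event == "stop_here" and position <= self.in_point:
            return False  # an out point must come after the in point
        if event == "start_here":
            self.in_point = position
        elif event == "stop_here":
            self.out_point = position
        elif event == "cancel":
            self.in_point = self.out_point = None
        self.state = next_state
        return True
```

Because every event is checked against the transition table, a "Stop Here" issued before any
"Start Here" is simply refused, which is one way of bounding the process so that mistakes are
avoided.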

Effect confirmation methods are employed to inform and protect the actions of the user;
visual metaphors will be provided from the embedded editor level to identify nodes of the
current state machine. For example, the video window may show a filmstrip with the current
frame highlighted, subsequent frames normal and previous frames, i.e. those cropped, with a
strike out marker.
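
The filmstrip metaphor reduces to a simple classification of frames relative to the current
one, sketched here. The function name filmstrip_styles and the style strings are hypothetical
labels chosen for the example; how each style is drawn is left to the renderer.

```python
from typing import List, Tuple


def filmstrip_styles(frame_count: int, current: int) -> List[Tuple[int, str]]:
    """Pair each frame index with the style the filmstrip would draw it in."""
    if not 0 <= current < frame_count:
        raise ValueError("current frame is outside the strip")
    styles = []
    for index in range(frame_count):
        if index < current:
            style = "struck-out"   # previous frames, i.e. those cropped
        elif index == current:
            style = "highlighted"  # the current frame
        else:
            style = "normal"       # subsequent frames
        styles.append((index, style))
    return styles
```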

According to another aspect of the invention metadata in the media file is recognised by the
system and used as a stream of control information that is used to assist editing operations.
In one embodiment of this aspect of the invention the meta-data may include but is not
limited to:

• Timecode,

• Closed caption,

• Edit points used during the creation of the media,

• Format-dependent properties such as GOP boundaries in MPEG,

• Data generated as a result of post-processing such as shot change information.

The control information identifies significant points in the media and triggers events that
cause instructional or informative information to be displayed. For example dialogue boxes
may pop up during playback with labels such as "Start Here" (IN) or "Stop Here" (OUT).
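
As a sketch of how such control information could trigger events, the following assumes that
the metadata has already been decoded into (time, kind) pairs. The kind names "edit_in",
"edit_out", "shot_change" and "gop", the LABELS table and the function events_between are
hypothetical; the description does not specify how the metadata is represented.

```python
from typing import Iterable, List, Tuple

# Hypothetical mapping from metadata kinds to the label shown to the user.
LABELS = {
    "edit_in": '"Start Here" (IN)',
    "edit_out": '"Stop Here" (OUT)',
    "shot_change": "Shot change",
}


def events_between(metadata: Iterable[Tuple[float, str]],
                   previous: float, current: float) -> List[str]:
    """Labels for the metadata points crossed while playing from previous to current.

    Kinds without a label (for example GOP boundaries) trigger no dialogue box.
    """
    return [LABELS[kind]
            for time, kind in sorted(metadata)
            if previous < time <= current and kind in LABELS]


# Example metadata: times are in seconds from the start of the media.
metadata = [(10.0, "edit_in"), (4.0, "gop"), (25.0, "edit_out")]
first = events_between(metadata, 0.0, 12.0)
second = events_between(metadata, 12.0, 30.0)
```

A player would call such a function once per rendering step with the previous and current
playback positions, and display a dialogue box for each label returned.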






Industrial Applicability 

As a simple example of the use of the invention consider this scenario. An education
professional is preparing materials for a lecture they are about to give. A part of this is an
electronic slide presentation with some of the slides containing embedded media objects. On
running through the presentation they realise that it over-runs the time allowance if all the
media clips are played in their entirety. Using the system described here the media can be
quickly and efficiently trimmed to suit requirements, without the need for switching
environments or applications.
CLAIMS

1. A computer based system for manipulating digital media, the system adding the capability
for media-manipulation to software media players, such that this capability is intrinsic to
the media player, and which comprises a set of tools that ensure that consistent
behavioural, visual and functional aspects are maintained between media player based
applications.

2. The system of claim 1 where media includes, but is not confined to, video and audio.

3. The system of claims 1 & 2 where manipulation includes, but is not confined to, the
operations of one or more of:

(a) Editing; trimming; annotating; effects; transitions; appearance; presentation.

4. The system of claims 1-3 comprising one or more of:

(a) A software component that implements a cache for portions of the media file in the
memory of the client machine.

(b) A software component that implements a process equivalent to a state machine,
whose transitions guide a user through a sequence of interactions with a graphical user
interface (GUI).

(c) A software graphics component of the GUI that implements visual feedback to the
user of the current state.

(d) A software graphics component of the GUI that implements a visual metaphor that
provides the user with an intuitive understanding of the operation of the interface.

(e) A software graphics renderer component that allows combination and/or overlay of
graphical data for the GUI with pixels that are decoded from the video part of the media
file and rendered into the video area on the screen.

(f) A software component that implements an export of the processed media to

(g) A software component that implements the ability to read a description file(s) and
construct playback in accordance with set instructions, or write such instructions from a
current playback.

(h) A software component of the GUI that allows labels or triggers of various types to
be added to significant parts of the media file in order to identify them as such.

5. The system of claims 1-4 where labels include, but are not confined to, edit in and
out-points, shot boundaries, Group-Of-Picture (GOP) boundaries, closed-caption text and
timecode.

6. The system of claims 1-5 where the triggers include, but are not confined to, initiate
pop-ups, hold frames for a given duration, loop and messaging.

7. The system of claims 1-5 additionally comprising:

A software decoder component that maps meta-data contained in the media file to
labels, where the meta-data includes, but is not confined to, shot boundaries,
Group-Of-Picture (GOP) boundaries, closed-caption and timecode.

8. The system of claims 1-7 additionally comprising:

A software agent component that maps aspects of the interactive behaviour of the user
into configuration information that may modify aspects of the behaviour of the GUI.

9. The system of claims 1-8 additionally comprising:

A media file that may optionally be selected and played by the user, which provides
instruction in the use of the GUI.